
    Stereo and ToF Data Fusion by Learning from Synthetic Data

    Time-of-Flight (ToF) sensors and stereo vision systems can both acquire depth information, but they have complementary characteristics and issues. A more accurate representation of the scene geometry can be obtained by fusing the two depth sources. In this paper we present a novel framework for data fusion in which the contribution of the two depth sources is controlled by confidence measures that are jointly estimated using a Convolutional Neural Network. The two depth sources are fused by enforcing the local consistency of the depth data, taking the estimated confidence information into account. The deep network is trained on a synthetic dataset, and we show how the classifier generalizes to different data, producing reliable estimates not only on synthetic data but also on real-world scenes. Experimental results show that the proposed approach increases the accuracy of the depth estimation on both synthetic and real data and that it outperforms state-of-the-art methods.
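As a rough illustration of the idea of confidence-driven fusion, a per-pixel weighted blend of the two depth maps can be sketched as below. This is a simplification, not the paper's locally consistent method (which additionally enforces spatial consistency), and the function and parameter names are assumptions for illustration only:

```python
import numpy as np

def fuse_depth(depth_tof, depth_stereo, conf_tof, conf_stereo, eps=1e-6):
    """Blend two depth maps pixel-wise, weighting each source by its
    estimated confidence. Where the ToF confidence dominates, the fused
    depth follows the ToF measurement, and vice versa."""
    w_tof = conf_tof / (conf_tof + conf_stereo + eps)
    return w_tof * depth_tof + (1.0 - w_tof) * depth_stereo
```

With equal confidences the result is the plain average; as one confidence goes to zero, the fused map falls back to the other sensor.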

    Exploiting Multiple Priors for Neural 3D Indoor Reconstruction

    Neural implicit modeling achieves impressive 3D reconstruction results on small objects, but it exhibits significant limitations in large indoor scenes. In this work, we propose a novel neural implicit modeling method that leverages multiple regularization strategies to achieve better reconstructions of large indoor environments while relying only on images. A sparse but accurate depth prior is used to anchor the scene to the initial model. A dense but less accurate depth prior is also introduced, flexible enough to let the model diverge from it to improve the estimated geometry. Then, a novel self-supervised strategy to regularize the estimated surface normals is presented. Finally, a learnable exposure compensation scheme makes it possible to cope with challenging lighting conditions. Experimental results show that our approach produces state-of-the-art 3D reconstructions in challenging indoor scenarios. Comment: Accepted at the British Machine Vision Conference (BMVC).
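The interplay of the two depth priors can be sketched as a combined loss: a strict term on the sparse-but-accurate prior and a truncated term on the dense-but-noisy prior, so the model may diverge from the latter. The truncation form, the threshold `tau`, and all names here are assumptions for illustration, not the paper's exact losses:

```python
import numpy as np

def depth_prior_loss(d_pred, d_sparse, sparse_mask, d_dense, tau=0.1):
    """Hard L1 on the sparse prior (only where it is defined) plus a
    truncated L1 on the dense prior: errors beyond tau are capped, so
    the model can move away from an inaccurate dense value at a
    bounded cost."""
    if sparse_mask.any():
        l_sparse = np.abs(d_pred - d_sparse)[sparse_mask].mean()
    else:
        l_sparse = 0.0
    l_dense = np.minimum(np.abs(d_pred - d_dense), tau).mean()
    return l_sparse + l_dense
```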

    Adversarial Learning and Self-Teaching Techniques for Domain Adaptation in Semantic Segmentation

    Deep learning techniques have been widely used in autonomous driving systems for the semantic understanding of urban scenes. However, they need a huge amount of labeled data for training, which is difficult and expensive to acquire. A recently proposed workaround is to train deep networks using synthetic data, but the domain shift between real-world and synthetic representations limits the performance. In this work, a novel Unsupervised Domain Adaptation (UDA) strategy is introduced to solve this issue. The proposed learning strategy is driven by three components: a standard supervised learning loss on labeled synthetic data; an adversarial learning module that exploits both labeled synthetic data and unlabeled real data; and finally a self-teaching strategy applied to unlabeled data. The last component exploits a region-growing framework guided by the segmentation confidence. Furthermore, we weight this component on the basis of the class frequencies to enhance the performance on less common classes. Experimental results prove the effectiveness of the proposed strategy in adapting a segmentation network trained on synthetic datasets, such as GTA5 and SYNTHIA, to real-world datasets such as Cityscapes and Mapillary. Comment: Accepted at IEEE Transactions on Intelligent Vehicles (T-IV); 10 pages, 2 figures, 7 tables.
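The class-frequency weighting of the self-teaching component can be sketched with a simple inverse-frequency scheme, where rare classes receive larger weights. This is an illustrative choice; the abstract does not give the paper's exact weighting formula:

```python
import numpy as np

def self_teaching_weights(class_freq):
    """Per-class weights inversely proportional to class frequency,
    normalized to have mean 1, so under-represented classes contribute
    more to the self-teaching loss."""
    freq = np.asarray(class_freq, dtype=float)
    w = 1.0 / np.maximum(freq, 1e-8)  # guard against empty classes
    return w / w.sum() * len(w)
```

For example, with pixel frequencies [0.5, 0.25, 0.25] the weights become [0.6, 1.2, 1.2]: the two rarer classes are up-weighted relative to the common one.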

    Intelligenza artificiale e sicurezza: opportunità, rischi e raccomandazioni

    AI (artificial intelligence) has been a booming discipline in recent years and will be increasingly so in the near future. However, AI has studied the emulation of intelligence by machines, understood as software and in some cases hardware, since 1956. AI arose from the idea of building machines that, inspired by processes related to human intelligence, are able to solve complex problems for which some kind of intelligent reasoning is usually believed to be required. The main current area of AI research and application is machine learning (algorithms that learn and adapt based on the data they receive), which has found wide application in recent years thanks to neural networks (mathematical models composed of artificial neurons), which in turn have enabled the emergence of deep learning (neural networks of greater complexity). Expert systems, computer vision, speech recognition, natural language processing, advanced robotics and some cybersecurity solutions also belong to the AI world. When it comes to AI, some are enthusiastic about the opportunities, while others fear futuristic technologies of a world where robots will replace humans, take away their jobs and make decisions for them.
In reality, AI is already widely used in many fields: in cell phones, smart objects (IoT), Industry 4.0, smart cities, cybersecurity systems, autonomous driving systems (driving or parking assistants), and chatbots on various websites; these are just a few examples, all based on typical artificial intelligence algorithms. Thanks to AI, companies can gain a variety of advantages by providing advanced, personalized services, predicting trends, anticipating user choices, and so on. But not all that glitters is gold: there are sometimes technical problems, ethical questions, security risks, and standards and legislation that are not entirely clear. Organizations already adopting AI-based solutions, or those planning to do so, could benefit from this publication to learn more about the opportunities, risks, and related countermeasures. Clusit's Community for Security hopes that this publication will provide readers with a useful overview of a technology, artificial intelligence, that will increasingly accompany us in our personal, social and working lives.

    Data Driven Approaches for Depth Data Denoising

    Scene depth is important information that can be used to retrieve the scene geometry, an element missing from standard color images. For this reason, depth information is employed in many applications such as 3D reconstruction, autonomous driving and robotics. The last decade has seen the spread of different commercial devices able to sense scene depth. Among these, Time-of-Flight (ToF) cameras are becoming popular because they are relatively cheap and can be miniaturized and implemented on portable devices. Stereo vision systems are the most widespread 3D sensors and are simply composed of two standard color cameras. However, they are not free from flaws; in particular, they fail when the scene has no texture. Active stereo and structured light systems have been developed to overcome this issue by using external light projectors. This thesis collects the findings of my Ph.D. research, which are mainly devoted to the denoising of depth data. First, some of the most widespread commercial 3D sensors are introduced with their strengths and limitations. Then, some techniques for the quality enhancement of ToF depth acquisition are presented and compared with other state-of-the-art methods. A first proposed method is based on a hardware modification of the standard ToF projector. A second approach instead uses multi-frequency ToF recordings as the input of a deep learning network to improve the depth estimation. A particular focus is given to how the denoising performance degrades when the network is trained on synthetic data and tested on real data, and a method to reduce this performance gap is proposed. Since ToF and stereo vision systems have complementary characteristics, the possibility of fusing the information coming from these sensors is analysed, and a method based on a locally consistent fusion, guided by a learning-based reliability measure for the two sensors, is proposed.
A part of this thesis is dedicated to the description of the data acquisition procedures and the related labeling required to collect the datasets we used for the training and evaluation of the proposed methods.

    Combination of spatially-modulated ToF and structured light for MPI-free depth estimation

    Multi-path Interference (MPI) is one of the major sources of error in Time-of-Flight (ToF) camera depth measurements. A possible solution for its removal is based on the separation of direct and global light through the projection of multiple sinusoidal patterns. In this work we extend this approach by applying a Structured Light (SL) technique to the same projected patterns. This makes it possible to compute two depth maps from a single ToF acquisition, one based on the Time-of-Flight principle and the other on the Structured Light principle. The two depth fields are finally combined using a Maximum-Likelihood approach in order to obtain an accurate depth estimation free from MPI error artifacts. Experimental results demonstrate that the proposed method has very good MPI correction properties, with state-of-the-art performance.
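Under the standard assumption of independent Gaussian noise on the two measurements, the Maximum-Likelihood combination of two depth estimates reduces to an inverse-variance weighted mean; the sketch below shows that textbook form (the paper's actual noise model and per-pixel variances may be richer):

```python
import numpy as np

def ml_fuse(d_tof, var_tof, d_sl, var_sl):
    """ML fusion of two depth estimates with independent Gaussian noise:
    d_hat = (d_tof/var_tof + d_sl/var_sl) / (1/var_tof + 1/var_sl).
    The less noisy measurement dominates the fused estimate."""
    w_tof = 1.0 / var_tof
    w_sl = 1.0 / var_sl
    return (w_tof * d_tof + w_sl * d_sl) / (w_tof + w_sl)
```

With equal variances this is the plain average; halving the ToF variance pulls the fused depth toward the ToF value.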

    Material Identification Using RF Sensors and Convolutional Neural Networks


    Unsupervised Domain Adaptation for ToF Data Denoising with Adversarial Learning

    Time-of-Flight data is typically affected by a high level of noise and by artifacts due to Multi-Path Interference (MPI). While various traditional approaches for ToF data improvement have been proposed, machine learning techniques have seldom been applied to this task, mostly due to the limited availability of real-world training data with depth ground truth. In this paper, we avoid relying on labeled real data in the learning framework. A Coarse-Fine CNN, able to exploit multi-frequency ToF data for MPI correction, is trained on synthetic data with ground truth in a supervised way. In parallel, an adversarial learning strategy based on the Generative Adversarial Networks (GAN) framework is used to perform an unsupervised pixel-level domain adaptation from synthetic to real-world data, exploiting unlabeled real-world acquisitions. Experimental results demonstrate that the proposed approach is able to effectively denoise real-world data and to outperform state-of-the-art techniques.
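The adversarial part of such a setup follows the usual GAN objective: a discriminator learns to tell the two domains apart, while the adapted network is rewarded for fooling it. The scalar sketch below shows only that objective, with illustrative names; the paper's architecture and exact loss formulation are not given in the abstract:

```python
import numpy as np

def bce(p, target):
    """Binary cross-entropy for discriminator outputs in (0, 1)."""
    p = np.clip(p, 1e-7, 1.0 - 1e-7)
    return -(target * np.log(p) + (1.0 - target) * np.log(1.0 - p)).mean()

def discriminator_loss(d_on_synth, d_on_real):
    # The discriminator learns to label synthetic-domain samples 1
    # and real-domain samples 0.
    return bce(d_on_synth, 1.0) + bce(d_on_real, 0.0)

def adversarial_loss(d_on_real):
    # The adapted network is rewarded when real-domain samples fool
    # the discriminator into predicting "synthetic".
    return bce(d_on_real, 1.0)
```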